Goto

Collaborating Authors

 gaussian graphical model


Learning Gaussian Graphical Models under Total Positivity via Spectral Graph Sparsification

arXiv.org Machine Learning

Many practical data analysis tasks reduce to learning, from observed samples, how a collection of variables depend on each other. A widely used approach is to fit a Gaussian graphical model, which represents the dependence structure as a graph connecting the variables. In a number of important applications, such as financial returns, gene co-expression, and climate or network analysis, the dependencies tend to be positive: variables move together rather than offset each other. Encoding this positivity through the constraint of multivariate total positivity of order two (MTP2) yields an attractive estimator that produces accurate fits with no tuning required. The resulting graphs are, however, typically much denser than the underlying ground-truth model, which makes them hard to interpret and slow to use in any downstream task that operates on the graph. In this work, we propose a novel highly-scalable approach for learning Gaussian graphical models from data using spectral sparsification; we call it Spectral-MTP2. Spectral graph sparsification is a fundamental method which aims to preserve meaningful properties of a dense graph with a sparser subgraph. We theoretically and empirically investigate and validate our method, and show that learning Gaussian Graphical Models under MTP2 using spectral sparsification preserves MTP2 and approximates well the original model in terms of Kullback-Leibler divergence and Gaussian log-likelihood. In simulations and applications to equity returns and gene expression, we observe that Spectral-MTP2 retains most of the fit quality of the denser MTP2 baseline, while producing substantially sparser and more interpretable graphs.



Statistical Inference for Pairwise Graphical Models Using Score Matching

Neural Information Processing Systems

Probabilistic graphical models have been widely used to model complex systems and aid scientific discoveries. As a result, there is a large body of literature focused on consistent model selection. However, scientists are often interested in understanding uncertainty associated with the estimated parameters, which current literature has not addressed thoroughly. In this paper, we propose a novel estimator for edge parameters for pairwise graphical models based on Hyvärinen scoring rule. Hyvärinen scoring rule is especially useful in cases where the normalizing constant cannot be obtained efficiently in a closed form.





Learning Large-Scale MTP _2 Gaussian Graphical Models via Bridge-Block Decomposition

Neural Information Processing Systems

This paper studies the problem of learning the large-scale Gaussian graphical models that are multivariate totally positive of order two ($\text{MTP}_2$). By introducing the concept of bridge, which commonly exists in large-scale sparse graphs, we show that the entire problem can be equivalently optimized through (1) several smaller-scaled sub-problems induced by a \emph{bridge-block decomposition} on the thresholded sample covariance graph and (2) a set of explicit solutions on entries corresponding to \emph{bridges}. From practical aspect, this simple and provable discipline can be applied to break down a large problem into small tractable ones, leading to enormous reduction on the computational complexity and substantial improvements for all existing algorithms. The synthetic and real-world experiments demonstrate that our proposed method presents a significant speed-up compared to the state-of-the-art benchmarks.


Conditional Matrix Flows for Gaussian Graphical Models

Neural Information Processing Systems

Studying conditional independence among many variables with few observations is a challenging task.Gaussian Graphical Models (GGMs) tackle this problem by encouraging sparsity in the precision matrix through $l_q$ regularization with $q\leq1$.However, most GMMs rely on the $l_1$ norm because the objective is highly non-convex for sub-$l_1$ pseudo-norms.In the frequentist formulation, the $l_1$ norm relaxation provides the solution path as a function of the shrinkage parameter $\lambda$.In the Bayesian formulation, sparsity is instead encouraged through a Laplace prior, but posterior inference for different $\lambda$ requires repeated runs of expensive Gibbs samplers.Here we propose a general framework for variational inference with matrix-variate Normalizing Flow in GGMs, which unifies the benefits of frequentist and Bayesian frameworks.As a key improvement on previous work, we train with one flow a continuum of sparse regression models jointly for all regularization parameters $\lambda$ and all $l_q$ norms, including non-convex sub-$l_1$ pseudo-norms.Within one model we thus have access to (i) the evolution of the posterior for any $\lambda$ and any $l_q$ (pseudo-) norm, (ii) the marginal log-likelihood for model selection, and (iii) the frequentist solution paths through simulated annealing in the MAP limit.


Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity

Neural Information Processing Systems

Functional brain networks are well described and estimated from data with Gaussian Graphical Models (GGMs), e.g.\ using sparse inverse covariance estimators. Comparing functional connectivity of subjects in two populations calls for comparing these estimated GGMs. Our goal is to identify differences in GGMs known to have similar structure. We characterize the uncertainty of differences with confidence intervals obtained using a parametric distribution on parameters of a sparse estimator. Sparse penalties enable statistical guarantees and interpretable models even in high-dimensional and low-sample settings. Characterizing the distributions of sparse models is inherently challenging as the penalties produce a biased estimator.